No penalty no tears: Least squares in high-dimensional linear models
نویسندگان
چکیده
Ordinary least squares (OLS) is the default method for fitting linear models, but is not applicable for problems with dimensionality larger than the sample size. For these problems, we advocate the use of a generalized version of OLS motivated by ridge regression, and propose two novel three-step algorithms involving least squares fitting and hard thresholding. The algorithms are methodologically simple to understand intuitively, computationally easy to implement efficiently, and theoretically appealing for choosing models consistently. Numerical exercises comparing our methods with penalizationbased approaches in simulations and data analyses illustrate the great potential of the proposed algorithms.
منابع مشابه
No penalty no tears : Least squares in high - dimensional linear models - Supplementary materials
متن کامل
High - Dimensional Generalized Linear Models and the Lasso
We consider high-dimensional generalized linear models with Lipschitz loss functions, and prove a nonasymptotic oracle inequality for the empirical risk minimizer with Lasso penalty. The penalty is based on the coefficients in the linear predictor, after normalization with the empirical norm. The examples include logistic regression, density estimation and classification with hinge loss. Least ...
متن کاملUsing an Efficient Penalty Method for Solving Linear Least Square Problem with Nonlinear Constraints
In this paper, we use a penalty method for solving the linear least squares problem with nonlinear constraints. In each iteration of penalty methods for solving the problem, the calculation of projected Hessian matrix is required. Given that the objective function is linear least squares, projected Hessian matrix of the penalty function consists of two parts that the exact amount of a part of i...
متن کاملRobust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data
Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...
متن کامل